Poker
No-Regret Strategy Solving in Imperfect-Information Games via Pre-Trained Embedding
Fu, Yanchang, Liu, Shengda, Xu, Pei, Huang, Kaiqi
High-quality information-set abstraction remains a core challenge in solving large-scale imperfect-information extensive-form games (IIEFGs) such as no-limit Texas Hold'em, where limited memory makes solving the full game impractical. State-of-the-art AI methods rely on pre-trained discrete clustering for abstraction, yet hard cluster assignment irreversibly discards critical information, namely the quantifiable, subtle differences between information sets that matter for strategy solving, and thereby degrades solution quality. Inspired by the word-embedding paradigm in natural language processing, this paper proposes Embedding CFR, a novel algorithm that solves strategies for IIEFGs within an embedding space. The algorithm pre-trains embeddings that map the features of individual information sets into an interconnected low-dimensional continuous space, where the resulting vectors capture both the distinctions and the connections between information sets more precisely. Embedding CFR drives strategy solving through regret accumulation and strategy updates in this embedding space, and a supporting theoretical analysis verifies its ability to reduce cumulative regret. Experiments on poker show that, at the same memory cost, Embedding CFR converges in exploitability significantly faster than cluster-based abstraction algorithms, confirming its effectiveness. To our knowledge, it is also the first poker AI algorithm to pre-train information-set abstractions via low-dimensional embeddings for strategy solving.
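A minimal sketch of what regret accumulation over pre-trained embeddings might look like, as opposed to indexing a regret table by a discrete cluster ID. The embedding dimension, action count, linear regret parameterization, and outer-product update below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Illustrative sketch: regrets are predicted from a pre-trained infoset
# embedding through a linear map, instead of being stored per cluster ID.
EMBED_DIM, NUM_ACTIONS = 16, 3  # assumed sizes for the toy example

class EmbeddingRegretTable:
    def __init__(self):
        # One weight vector per action; regret(infoset) ~= W @ embedding.
        self.W = np.zeros((NUM_ACTIONS, EMBED_DIM))

    def strategy(self, embedding: np.ndarray) -> np.ndarray:
        """Regret matching over the regrets predicted from the embedding."""
        regrets = self.W @ embedding
        positive = np.maximum(regrets, 0.0)
        total = positive.sum()
        return positive / total if total > 0 else np.full(NUM_ACTIONS, 1.0 / NUM_ACTIONS)

    def accumulate(self, embedding: np.ndarray, instant_regret: np.ndarray) -> None:
        """Fold one traversal's instantaneous regret into the weights.

        The outer-product update lets similar embeddings share regret signal,
        which is the intended benefit over hard clustering.
        """
        self.W += np.outer(instant_regret, embedding)

# Usage inside a CFR traversal: look up the pre-trained embedding of the
# current infoset, query the strategy, then accumulate counterfactual regret.
table = EmbeddingRegretTable()
infoset_embedding = np.random.randn(EMBED_DIM)  # stand-in for a pre-trained vector
sigma = table.strategy(infoset_embedding)
table.accumulate(infoset_embedding, instant_regret=np.array([0.2, -0.1, 0.4]))
```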
Dominated Actions in Imperfect-Information Games
Dominance is a fundamental concept in game theory. In normal-form games, dominated strategies can be identified in polynomial time. As a consequence, iterative removal of dominated strategies can be performed efficiently as a preprocessing step for reducing the size of a game before computing a Nash equilibrium. For imperfect-information games in extensive form, we could convert the game to normal form and then iteratively remove dominated strategies in the same way; however, this conversion may cause an exponential blowup in game size. In this paper, we define and study the concept of dominated actions in imperfect-information games. Our main result is a polynomial-time algorithm for determining whether an action is dominated (strictly or weakly) by any mixed strategy in n-player games, which can be extended to an algorithm for iteratively removing dominated actions. This allows us to efficiently reduce the size of the game tree as a preprocessing step for Nash equilibrium computation. We explore the role of dominated actions empirically in "All In or Fold" No-Limit Texas Hold'em poker.
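The polynomial-time flavor of such a dominance check can be illustrated with the standard linear program for strict dominance by a mixed strategy in a single player's normal-form payoff matrix. The `scipy`-based formulation, payoff matrix, and tolerance below are illustrative assumptions; the paper's algorithm works on n-player extensive-form games directly.

```python
import numpy as np
from scipy.optimize import linprog

def strictly_dominated(payoffs: np.ndarray, action: int) -> bool:
    """Check whether row `action` is strictly dominated by some mixed strategy
    over the remaining rows (rows = own actions, columns = opponent profiles).

    Illustrative LP: maximize eps subject to, for every column s,
        sum_b p_b * payoffs[b, s] >= payoffs[action, s] + eps,  p on the simplex.
    The action is strictly dominated iff the optimal eps is positive.
    """
    others = [b for b in range(payoffs.shape[0]) if b != action]
    n_cols, k = payoffs.shape[1], len(others)
    # Variables: p_1..p_k, eps.  linprog minimizes, so minimize -eps.
    c = np.zeros(k + 1)
    c[-1] = -1.0
    # Each column s gives: payoffs[action, s] + eps - sum_b p_b payoffs[b, s] <= 0.
    A_ub = np.hstack([-payoffs[others, :].T, np.ones((n_cols, 1))])
    b_ub = -payoffs[action, :]
    A_eq = np.zeros((1, k + 1))
    A_eq[0, :k] = 1.0                     # probabilities sum to 1
    bounds = [(0, 1)] * k + [(None, None)]  # eps is free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0], bounds=bounds)
    return res.success and -res.fun > 1e-9

# Example: the middle row is strictly dominated by a 50/50 mix of the other two.
payoffs = np.array([[3.0, 0.0],
                    [1.0, 1.0],
                    [0.0, 3.0]])
print(strictly_dominated(payoffs, action=1))  # True: the mix earns 1.5 > 1 everywhere
```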
KrwEmd: Revising the Imperfect-Recall Abstraction from Forgetting Everything
Fu, Yanchang, Yin, Qiyue, Liu, Shengda, Xu, Pei, Huang, Kaiqi
Excessive abstraction is a critical challenge in hand abstraction, a task specific to games like Texas hold'em, when solving large-scale imperfect-information games, as it impairs AI performance. This issue arises from extreme implementations of imperfect-recall abstraction, which entirely discard historical information. This paper presents KrwEmd, the first practical algorithm designed to address this problem. We first introduce the k-recall winrate feature, which not only qualitatively distinguishes signal observation infosets by leveraging both future and, crucially, historical game information, but also quantitatively captures their similarity. We then develop the KrwEmd algorithm, which clusters signal observation infosets using earth mover's distance to measure discrepancies between their features. Experimental results demonstrate that KrwEmd significantly improves AI gameplay performance compared to existing algorithms.
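A rough illustration of the clustering step: measure the discrepancy between winrate histograms with the earth mover's distance and assign each infoset to the nearest medoid histogram. The bin layout, single-histogram feature, and medoid-assignment scheme below are placeholder assumptions, not the paper's k-recall feature or clustering pipeline.

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Placeholder assumption: each signal observation infoset is summarized by a
# histogram of winrates over fixed buckets; the paper's k-recall winrate
# feature and clustering details are richer than this toy version.
BINS = np.linspace(0.0, 1.0, 11)[:-1]  # 10 winrate buckets

def emd(feature_a: np.ndarray, feature_b: np.ndarray) -> float:
    """Earth mover's distance between two winrate histograms on fixed bins."""
    return wasserstein_distance(BINS, BINS, u_weights=feature_a, v_weights=feature_b)

def assign_to_medoids(features: np.ndarray, medoids: np.ndarray) -> np.ndarray:
    """Assign each infoset feature to the medoid with the smallest EMD."""
    return np.array([np.argmin([emd(f, m) for m in medoids]) for f in features])

# Toy usage: 100 random winrate histograms assigned to 5 random medoids.
rng = np.random.default_rng(0)
features = rng.dirichlet(np.ones(10), size=100)
medoids = rng.dirichlet(np.ones(10), size=5)
labels = assign_to_medoids(features, medoids)
print(np.bincount(labels, minlength=5))  # cluster sizes
```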
Outbidding and Outbluffing Elite Humans: Mastering Liar's Poker via Self-Play and Reinforcement Learning
Dewey, Richard, Botyanszki, Janos, Moallemi, Ciamac C., Zheng, Andrew T.
AI researchers have long focused on poker-like games as a testbed for environments characterized by multi-player dynamics, imperfect information, and reasoning under uncertainty. While recent breakthroughs have matched elite human play at no-limit Texas hold'em, the multi-player dynamics are subdued: most hands converge quickly with only two players engaged through multiple rounds of bidding. In this paper, we present Solly, the first AI agent to achieve elite human play in reduced-format Liar's Poker, a game characterized by extensive multi-player engagement. We trained Solly using self-play with a model-free, actor-critic, deep reinforcement learning algorithm. Solly played at an elite human level as measured by win rate (won over 50% of hands) and equity (money won) in heads-up and multi-player Liar's Poker. Solly also outperformed large language models (LLMs), including those with reasoning abilities, on the same metrics. Solly developed novel bidding strategies, randomized play effectively, and was not easily exploitable by world-class human players.
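A minimal sketch of the kind of model-free actor-critic update such self-play training relies on. The observation size, bid-space size, one-step advantage estimate, and the hand-crafted transition at the end are assumptions for illustration; the paper's actual network architecture and training loop are not reproduced here.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; a real Liar's Poker encoding would differ.
OBS_DIM, NUM_BIDS = 64, 20

class ActorCritic(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(OBS_DIM, 128), nn.ReLU())
        self.policy_head = nn.Linear(128, NUM_BIDS)  # logits over bids/challenges
        self.value_head = nn.Linear(128, 1)          # state-value estimate

    def forward(self, obs):
        h = self.body(obs)
        return self.policy_head(h), self.value_head(h).squeeze(-1)

def one_step_update(model, optimizer, obs, action, reward, next_value, gamma=1.0):
    """Single advantage actor-critic step on one self-play transition."""
    logits, value = model(obs)
    dist = torch.distributions.Categorical(logits=logits)
    advantage = reward + gamma * next_value - value
    policy_loss = -dist.log_prob(action) * advantage.detach()
    value_loss = advantage.pow(2)
    loss = (policy_loss + 0.5 * value_loss).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

model = ActorCritic()
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
# In self-play, every seat samples from the same (periodically frozen) policy,
# and each finished hand contributes transitions like this fabricated one:
obs = torch.randn(1, OBS_DIM)
one_step_update(model, optimizer, obs, torch.tensor([3]), torch.tensor([1.0]),
                next_value=torch.tensor([0.0]))
```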
Beyond Outcome-Based Imperfect-Recall: Higher-Resolution Abstractions for Imperfect-Information Games
Fu, Yanchang, Yin, Qiyue, Liu, Shengda, Xu, Pei, Huang, Kaiqi
Hand abstraction is crucial for scaling imperfect-information games (IIGs) such as Texas Hold'em, yet progress is limited by the lack of a formal task model and by evaluations that require resource-intensive strategy solving. We introduce signal observation ordered games (SOOGs), a subclass of IIGs tailored to hold'em-style games that cleanly separates signal sequences from player action sequences, providing a precise mathematical foundation for hand abstraction. Within this framework, we define a resolution bound, an information-theoretic upper bound on achievable performance under a given signal abstraction. Using this bound, we show that mainstream outcome-based imperfect-recall algorithms suffer substantial losses by arbitrarily discarding historical information; we formalize this behavior via potential-aware outcome isomorphism (PAOI) and prove that PAOI characterizes their resolution bound. To overcome this limitation, we propose full-recall outcome isomorphism (FROI), which integrates historical information to raise the bound and improve policy quality. Experiments on hold'em-style benchmarks confirm that FROI consistently outperforms outcome-based imperfect-recall baselines. Our results provide a unified formal treatment of hand abstraction and practical guidance for designing higher-resolution abstractions in IIGs.
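A small sketch of the difference between an outcome-based imperfect-recall bucketing key and a full-recall key: the former keys only on the current round's outcome feature, while the latter keys on the whole per-round sequence. The winrate-quantile feature and bucket count below are illustrative assumptions, not the paper's definition of FROI or its resolution-bound analysis.

```python
from typing import Sequence, Tuple

# Illustrative assumption: each round's "outcome feature" is a winrate
# quantized into NUM_BUCKETS levels.
NUM_BUCKETS = 5

def bucketize(winrate: float) -> int:
    """Quantize a winrate in [0, 1] into one of NUM_BUCKETS levels."""
    return min(int(winrate * NUM_BUCKETS), NUM_BUCKETS - 1)

def imperfect_recall_key(round_winrates: Sequence[float]) -> int:
    """Outcome-based imperfect recall: only the current round's feature survives."""
    return bucketize(round_winrates[-1])

def full_recall_key(round_winrates: Sequence[float]) -> Tuple[int, ...]:
    """Full-recall keying: the whole per-round trajectory is kept, so hands that
    reached the same current strength by different paths stay distinguishable."""
    return tuple(bucketize(w) for w in round_winrates)

# Two hands with identical river strength but different histories:
steady = [0.55, 0.56, 0.57, 0.58]   # was always moderately strong
drawing = [0.30, 0.32, 0.35, 0.58]  # a draw that just got there
print(imperfect_recall_key(steady) == imperfect_recall_key(drawing))  # True: merged
print(full_recall_key(steady) == full_recall_key(drawing))            # False: kept apart
```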